Skip to content

Conversation

@folkertdev
Copy link
Contributor

@folkertdev folkertdev commented Jan 18, 2026

Add simd_splat which lowers to the LLVM canonical splat sequence.

insertelement <N x elem> poison, elem %x, i32 0
shufflevector <N x elem> v0, <N x elem> poison, <N x i32> zeroinitializer

Right now we try to fake it using one of

fn splat(x: u32) -> u32x8 {
    u32x8::from_array([x; 8])
}

or (in stdarch)

fn splat(value: $elem_type) -> $name {
    #[derive(Copy, Clone)]
    #[repr(simd)]
    struct JustOne([$elem_type; 1]);
    let one = JustOne([value]);
    // SAFETY: 0 is always in-bounds because we're shuffling
    // a simd type with exactly one element.
    unsafe { simd_shuffle!(one, one, [0; $len]) }
}

Both of these can confuse the LLVM optimizer, producing sub-par code. Some examples:


As far as I can tell there is no way to provide a fallback implementation for this intrinsic, because there is no const way of evaluating the number of elements (there might be issues beyond that, too). So, I added implementations for all 4 backends.

Both GCC and const-eval appear to have some issues with simd vectors containing pointers. I have a workaround for GCC, but haven't yet been able to make const-eval work. See the comments below.

Currently this just adds the intrinsic, it does not actually use it anywhere yet.

@rustbot rustbot added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jan 18, 2026
@rust-log-analyzer

This comment has been minimized.

@programmerjake
Copy link
Member

this would also be useful for @rust-lang/project-portable-simd

@rust-log-analyzer

This comment has been minimized.

@folkertdev
Copy link
Contributor Author

r? @workingjubilee (or anyone else with intrinsic/simd knowledge)

@folkertdev folkertdev marked this pull request as ready for review January 19, 2026 13:08
@rustbot
Copy link
Collaborator

rustbot commented Jan 19, 2026

workingjubilee is currently at their maximum review capacity.
They may take a while to respond.

@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 19, 2026
@rustbot
Copy link
Collaborator

rustbot commented Jan 19, 2026

Some changes occurred to the platform-builtins intrinsics. Make sure the
LLVM backend as well as portable-simd gets adapted for the changes.

cc @antoyo, @GuillaumeGomez, @bjorn3, @calebzulawski, @programmerjake

Some changes occurred to the CTFE machinery

cc @RalfJung, @oli-obk, @lcnr

Some changes occurred in compiler/rustc_codegen_cranelift

cc @bjorn3

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo, @GuillaumeGomez

Some changes occurred to the intrinsics. Make sure the CTFE / Miri interpreter
gets adapted for the changes, if necessary.

cc @rust-lang/miri, @RalfJung, @oli-obk, @lcnr

Some changes occurred to the CTFE / Miri interpreter

cc @rust-lang/miri

@rustbot rustbot removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Jan 19, 2026
Copy link
Member

@RalfJung RalfJung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r=me on the const_eval and codegen_ssa changes.

View changes since this review

@rust-log-analyzer

This comment has been minimized.

@folkertdev
Copy link
Contributor Author

or make ./x test locally do what CI does. Yeah this bites me basically every time ... anyway should be fixed now.

@RalfJung
Copy link
Member

RalfJung commented Jan 24, 2026

or make ./x test locally do what CI does.

It does, but different CI jobs use different bootstrap.toml values -- some have optimizations enabled, some disabled. Generally this is good for test coverage, except for tests that really care about the exact IR being emitted...

@folkertdev
Copy link
Contributor Author

I think PR CI should at least catch this, having rollups fail because of it is such a waste of time.

@RalfJung
Copy link
Member

Could you file an issue? I think I agree with Jubilee's option of just not applying noopt to codegen tests; those tests are expected to be highly sensitive to optimization flags. However, for some tests it's the opt flags of the standard library build that matter, that could be more tricky to deal with.

@folkertdev
Copy link
Contributor Author

@bors r=workingjubilee

@rust-bors
Copy link
Contributor

rust-bors bot commented Jan 24, 2026

📌 Commit 71f3442 has been approved by workingjubilee

It is now in the queue for this repository.

@rust-bors rust-bors bot added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jan 24, 2026
JonathanBrouwer added a commit to JonathanBrouwer/rust that referenced this pull request Jan 24, 2026
…bilee

add `simd_splat` intrinsic

Add `simd_splat` which lowers to the LLVM canonical splat sequence.

```llvm
insertelement <N x elem> poison, elem %x, i32 0
shufflevector <N x elem> v0, <N x elem> poison, <N x i32> zeroinitializer
```

Right now we try to fake it using one of

```rust
fn splat(x: u32) -> u32x8 {
    u32x8::from_array([x; 8])
}
```

or (in `stdarch`)

```rust
fn splat(value: $elem_type) -> $name {
    #[derive(Copy, Clone)]
    #[repr(simd)]
    struct JustOne([$elem_type; 1]);
    let one = JustOne([value]);
    // SAFETY: 0 is always in-bounds because we're shuffling
    // a simd type with exactly one element.
    unsafe { simd_shuffle!(one, one, [0; $len]) }
}
```

Both of these can confuse the LLVM optimizer, producing sub-par code. Some examples:

- rust-lang#60637
- rust-lang#137407
- rust-lang#122623
- rust-lang#97804

---

As far as I can tell there is no way to provide a fallback implementation for this intrinsic, because there is no `const` way of evaluating the number of elements (there might be issues beyond that, too). So, I added implementations for all 4 backends.

Both GCC and const-eval appear to have some issues with simd vectors containing pointers. I have a workaround for GCC, but haven't yet been able to make const-eval work. See the comments below.

Currently this just adds the intrinsic, it does not actually use it anywhere yet.
JonathanBrouwer added a commit to JonathanBrouwer/rust that referenced this pull request Jan 24, 2026
…bilee

add `simd_splat` intrinsic

Add `simd_splat` which lowers to the LLVM canonical splat sequence.

```llvm
insertelement <N x elem> poison, elem %x, i32 0
shufflevector <N x elem> v0, <N x elem> poison, <N x i32> zeroinitializer
```

Right now we try to fake it using one of

```rust
fn splat(x: u32) -> u32x8 {
    u32x8::from_array([x; 8])
}
```

or (in `stdarch`)

```rust
fn splat(value: $elem_type) -> $name {
    #[derive(Copy, Clone)]
    #[repr(simd)]
    struct JustOne([$elem_type; 1]);
    let one = JustOne([value]);
    // SAFETY: 0 is always in-bounds because we're shuffling
    // a simd type with exactly one element.
    unsafe { simd_shuffle!(one, one, [0; $len]) }
}
```

Both of these can confuse the LLVM optimizer, producing sub-par code. Some examples:

- rust-lang#60637
- rust-lang#137407
- rust-lang#122623
- rust-lang#97804

---

As far as I can tell there is no way to provide a fallback implementation for this intrinsic, because there is no `const` way of evaluating the number of elements (there might be issues beyond that, too). So, I added implementations for all 4 backends.

Both GCC and const-eval appear to have some issues with simd vectors containing pointers. I have a workaround for GCC, but haven't yet been able to make const-eval work. See the comments below.

Currently this just adds the intrinsic, it does not actually use it anywhere yet.
rust-bors bot pushed a commit that referenced this pull request Jan 24, 2026
…uwer

Rollup of 3 pull requests

Successful merges:

 - #151346 (add `simd_splat` intrinsic)
 - #151353 (compiletest: Make `aux-crate` directive explicitly handle `--extern` modifiers)
 - #151571 (Fix cstring-merging test for Hexagon target)

r? @ghost
JonathanBrouwer added a commit to JonathanBrouwer/rust that referenced this pull request Jan 24, 2026
…bilee

add `simd_splat` intrinsic

Add `simd_splat` which lowers to the LLVM canonical splat sequence.

```llvm
insertelement <N x elem> poison, elem %x, i32 0
shufflevector <N x elem> v0, <N x elem> poison, <N x i32> zeroinitializer
```

Right now we try to fake it using one of

```rust
fn splat(x: u32) -> u32x8 {
    u32x8::from_array([x; 8])
}
```

or (in `stdarch`)

```rust
fn splat(value: $elem_type) -> $name {
    #[derive(Copy, Clone)]
    #[repr(simd)]
    struct JustOne([$elem_type; 1]);
    let one = JustOne([value]);
    // SAFETY: 0 is always in-bounds because we're shuffling
    // a simd type with exactly one element.
    unsafe { simd_shuffle!(one, one, [0; $len]) }
}
```

Both of these can confuse the LLVM optimizer, producing sub-par code. Some examples:

- rust-lang#60637
- rust-lang#137407
- rust-lang#122623
- rust-lang#97804

---

As far as I can tell there is no way to provide a fallback implementation for this intrinsic, because there is no `const` way of evaluating the number of elements (there might be issues beyond that, too). So, I added implementations for all 4 backends.

Both GCC and const-eval appear to have some issues with simd vectors containing pointers. I have a workaround for GCC, but haven't yet been able to make const-eval work. See the comments below.

Currently this just adds the intrinsic, it does not actually use it anywhere yet.
JonathanBrouwer added a commit to JonathanBrouwer/rust that referenced this pull request Jan 24, 2026
…bilee

add `simd_splat` intrinsic

Add `simd_splat` which lowers to the LLVM canonical splat sequence.

```llvm
insertelement <N x elem> poison, elem %x, i32 0
shufflevector <N x elem> v0, <N x elem> poison, <N x i32> zeroinitializer
```

Right now we try to fake it using one of

```rust
fn splat(x: u32) -> u32x8 {
    u32x8::from_array([x; 8])
}
```

or (in `stdarch`)

```rust
fn splat(value: $elem_type) -> $name {
    #[derive(Copy, Clone)]
    #[repr(simd)]
    struct JustOne([$elem_type; 1]);
    let one = JustOne([value]);
    // SAFETY: 0 is always in-bounds because we're shuffling
    // a simd type with exactly one element.
    unsafe { simd_shuffle!(one, one, [0; $len]) }
}
```

Both of these can confuse the LLVM optimizer, producing sub-par code. Some examples:

- rust-lang#60637
- rust-lang#137407
- rust-lang#122623
- rust-lang#97804

---

As far as I can tell there is no way to provide a fallback implementation for this intrinsic, because there is no `const` way of evaluating the number of elements (there might be issues beyond that, too). So, I added implementations for all 4 backends.

Both GCC and const-eval appear to have some issues with simd vectors containing pointers. I have a workaround for GCC, but haven't yet been able to make const-eval work. See the comments below.

Currently this just adds the intrinsic, it does not actually use it anywhere yet.
rust-bors bot pushed a commit that referenced this pull request Jan 24, 2026
…uwer

Rollup of 3 pull requests

Successful merges:

 - #151346 (add `simd_splat` intrinsic)
 - #151353 (compiletest: Make `aux-crate` directive explicitly handle `--extern` modifiers)
 - #151571 (Fix cstring-merging test for Hexagon target)

r? @ghost
@workingjubilee
Copy link
Member

@RalfJung

nikic already filed an equivalent issue I hadn't seen when I made my comment: #151580

rust-bors bot pushed a commit that referenced this pull request Jan 24, 2026
Rollup of 11 pull requests

Successful merges:

 - #149962 (Promote powerpc64-unknown-linux-musl to tier 2 with host tools)
 - #150138 (Add new Tier 3 targets for ARMv6)
 - #150905 (Fix(lib/win/net): Remove hostname support under Win7)
 - #151094 (remote-test-server: Fix compilation on UEFI targets)
 - #151346 (add `simd_splat` intrinsic)
 - #151353 (compiletest: Make `aux-crate` directive explicitly handle `--extern` modifiers)
 - #151538 (std: `sleep_until` on Motor and VEX)
 - #151098 (Add Korean translation to Rust By Example)
 - #151157 (Extend build-manifest local test guide)
 - #151403 (std: use 64-bit `clock_nanosleep` on GNU/Linux if available)
 - #151571 (Fix cstring-merging test for Hexagon target)
@rust-bors rust-bors bot merged commit 3a69035 into rust-lang:main Jan 25, 2026
11 checks passed
@rustbot rustbot added this to the 1.95.0 milestone Jan 25, 2026
rust-timer added a commit that referenced this pull request Jan 25, 2026
Rollup merge of #151346 - folkertdev:simd-splat, r=workingjubilee

add `simd_splat` intrinsic

Add `simd_splat` which lowers to the LLVM canonical splat sequence.

```llvm
insertelement <N x elem> poison, elem %x, i32 0
shufflevector <N x elem> v0, <N x elem> poison, <N x i32> zeroinitializer
```

Right now we try to fake it using one of

```rust
fn splat(x: u32) -> u32x8 {
    u32x8::from_array([x; 8])
}
```

or (in `stdarch`)

```rust
fn splat(value: $elem_type) -> $name {
    #[derive(Copy, Clone)]
    #[repr(simd)]
    struct JustOne([$elem_type; 1]);
    let one = JustOne([value]);
    // SAFETY: 0 is always in-bounds because we're shuffling
    // a simd type with exactly one element.
    unsafe { simd_shuffle!(one, one, [0; $len]) }
}
```

Both of these can confuse the LLVM optimizer, producing sub-par code. Some examples:

- #60637
- #137407
- #122623
- #97804

---

As far as I can tell there is no way to provide a fallback implementation for this intrinsic, because there is no `const` way of evaluating the number of elements (there might be issues beyond that, too). So, I added implementations for all 4 backends.

Both GCC and const-eval appear to have some issues with simd vectors containing pointers. I have a workaround for GCC, but haven't yet been able to make const-eval work. See the comments below.

Currently this just adds the intrinsic, it does not actually use it anywhere yet.
github-actions bot pushed a commit to rust-lang/miri that referenced this pull request Jan 25, 2026
Rollup of 11 pull requests

Successful merges:

 - rust-lang/rust#149962 (Promote powerpc64-unknown-linux-musl to tier 2 with host tools)
 - rust-lang/rust#150138 (Add new Tier 3 targets for ARMv6)
 - rust-lang/rust#150905 (Fix(lib/win/net): Remove hostname support under Win7)
 - rust-lang/rust#151094 (remote-test-server: Fix compilation on UEFI targets)
 - rust-lang/rust#151346 (add `simd_splat` intrinsic)
 - rust-lang/rust#151353 (compiletest: Make `aux-crate` directive explicitly handle `--extern` modifiers)
 - rust-lang/rust#151538 (std: `sleep_until` on Motor and VEX)
 - rust-lang/rust#151098 (Add Korean translation to Rust By Example)
 - rust-lang/rust#151157 (Extend build-manifest local test guide)
 - rust-lang/rust#151403 (std: use 64-bit `clock_nanosleep` on GNU/Linux if available)
 - rust-lang/rust#151571 (Fix cstring-merging test for Hexagon target)
github-actions bot pushed a commit to rust-lang/stdarch that referenced this pull request Jan 26, 2026
Rollup of 11 pull requests

Successful merges:

 - rust-lang/rust#149962 (Promote powerpc64-unknown-linux-musl to tier 2 with host tools)
 - rust-lang/rust#150138 (Add new Tier 3 targets for ARMv6)
 - rust-lang/rust#150905 (Fix(lib/win/net): Remove hostname support under Win7)
 - rust-lang/rust#151094 (remote-test-server: Fix compilation on UEFI targets)
 - rust-lang/rust#151346 (add `simd_splat` intrinsic)
 - rust-lang/rust#151353 (compiletest: Make `aux-crate` directive explicitly handle `--extern` modifiers)
 - rust-lang/rust#151538 (std: `sleep_until` on Motor and VEX)
 - rust-lang/rust#151098 (Add Korean translation to Rust By Example)
 - rust-lang/rust#151157 (Extend build-manifest local test guide)
 - rust-lang/rust#151403 (std: use 64-bit `clock_nanosleep` on GNU/Linux if available)
 - rust-lang/rust#151571 (Fix cstring-merging test for Hexagon target)
github-actions bot pushed a commit to rust-lang/rustc-dev-guide that referenced this pull request Jan 26, 2026
Rollup of 11 pull requests

Successful merges:

 - rust-lang/rust#149962 (Promote powerpc64-unknown-linux-musl to tier 2 with host tools)
 - rust-lang/rust#150138 (Add new Tier 3 targets for ARMv6)
 - rust-lang/rust#150905 (Fix(lib/win/net): Remove hostname support under Win7)
 - rust-lang/rust#151094 (remote-test-server: Fix compilation on UEFI targets)
 - rust-lang/rust#151346 (add `simd_splat` intrinsic)
 - rust-lang/rust#151353 (compiletest: Make `aux-crate` directive explicitly handle `--extern` modifiers)
 - rust-lang/rust#151538 (std: `sleep_until` on Motor and VEX)
 - rust-lang/rust#151098 (Add Korean translation to Rust By Example)
 - rust-lang/rust#151157 (Extend build-manifest local test guide)
 - rust-lang/rust#151403 (std: use 64-bit `clock_nanosleep` on GNU/Linux if available)
 - rust-lang/rust#151571 (Fix cstring-merging test for Hexagon target)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants